Machine learning-empowered sleep staging classification using multi-modality signals

The goal is to enhance an automated sleep staging system's performance by leveraging the diverse signals captured through multi-modal polysomnography (PSG) recordings. Three modalities of PSG signals, namely electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG), were considered to obtain the optimal fusions of the PSG signals, from which 63 features were extracted. These include frequency-based, time-based, statistical-based, entropy-based, and non-linear-based features. We adopted the ReliefF (ReF) feature selection algorithm to find the most suitable features for each signal and for the superposition of PSG signals. The twelve top features were selected according to their correlation with the sleep stages. The selected features were fed into the AdaBoost with Random Forest (ADB + RF) classifier to validate the chosen features and classify the sleep stages. The experiments were conducted under two testing schemes: epoch-wise testing and subject-wise testing. The research was conducted using the following publicly available datasets: ISRUC-Sleep subgroup1 (ISRUC-SG1), Sleep-EDF (S-EDF), Physio Bank CAP sleep database (PB-CAPSDB), and S-EDF-78. This work demonstrated that the proposed fusion strategy outperforms the common individual usage of PSG signals.


Introduction
Sleep is a fundamental necessity for humans, crucial for maintaining physical and mental well-being [1]. Inadequate sleep patterns have been observed to lead to difficulties in learning, concentration, and decision-making and can impact social interactions. Prolonged adherence to such sleep behaviors may result in various sleep disorders. Notably, certain sleep disorders, like obstructive sleep apnea (OSA) [2], have direct or indirect associations with chronic diseases, such as an increased risk of stroke [3]. Additionally, insomnia has been linked to conditions like diabetes and cardiovascular diseases [4]. Therefore, assessing sleep quality and employing proper diagnostic procedures to address diverse sleep issues is imperative for overall health. Two main standards, the R&K and AASM guidelines, are used to examine sleep patterns and their attributes. The AASM guidelines revised the R&K rules, with changes that include treating S3 and S4 as part of the N3 stage [5].
Experts commonly employ the Polysomnography (PSG) test to assess different types of sleep disorders in subjects. PSG signals typically include an electroencephalogram (EEG) [6], electrocardiogram (ECG) [6], electrooculogram (EOG) [7], and electromyogram (EMG) [8]. These signals are recorded and analyzed visually by experts. The process involves at least two experts, one interpreting the signal waveforms while the other annotates them [9]. In the traditional diagnostic approach, manual inspection is used to observe and label the subject's sleep behavior. However, this method often yields lower performance due to variations in labeling and annotation skills among experts [10]. Additionally, reaching a consensus on sleep stage labels between the two experts can be challenging. As a result, many automated sleep staging systems have been developed to analyze sleep stages based on various sleep disorders, aiming to automate the scoring of sleep stages [11]. Figure 1 illustrates the EEG pattern of sleep stages. The depicted sleep EEG behavior is from subject id-61, a 61-year-old male, sourced from the Physio Bank CAP Sleep (PB-CAPSD) database [12]. This particular subject experienced periodic limb movement disorder. The figure highlights distinct EEG behaviors associated with each sleep stage, annotated to showcase their waveform characteristics. The N1 stage represents a transitional phase between light and deep sleep. In this stage, the EEG predominantly contains alpha waveforms, constituting about 2-5% of total sleep. Moving to stage N2, waveforms such as sleep spindles and k-complexes are prevalent, covering approximately 40-60% of total sleep for one subject [12]. Finally, the REM stage behavior closely resembles the wake stage, featuring sawtooth waves with alpha and theta activities [13]. The interconnected changes in sleep behavior during transitions between stages play a vital role in studying mental and physical health. Individuals with various sleep disorders often deviate from a regular sleep cycle [14].
Fig. 1 EEG patterns with the different sleep stages
Therefore, classifying sleep stages, particularly N1 or an extended transition period like N2, is crucial for identifying irregularities during sleep. In routine practice, sleep experts traditionally record multiple EEG signals manually and label them with corresponding sleep stages, making the entire process labor-intensive, time-consuming, and costly [15].
In the intersection of brainwave analysis and machine learning, extracting features from EEG signals plays a pivotal role. Wavelet transform, for instance, can analyze signals at multiple scales, making it valuable for detecting episodic events or signal changes over time. This characteristic renders it suitable for identifying changes in EEG signals, such as sudden increases or decreases in activity, which may be associated with specific events. This suggests a promising avenue for leveraging machine learning techniques to enhance the accuracy of sleep pattern analysis.
Despite the successes seen with both single and multimodal sleep staging methods, several notable drawbacks persist: i) A generalized framework adaptable to classification tasks ranging from the conventional five-stage to two-stage sleep staging is lacking. ii) Supervised classification models, while effective with known data, may struggle with new records and can misclassify significant sleep stage patterns. Additionally, the features extracted for these models may be limited and fail to capture the complexity of the original signals adequately. iii) Misclassification of several epochs as belonging to either N1 or REM stages has been observed, directly impacting the accuracy of sleep staging algorithms.
This study aims to leverage multi-modal signal fusions and apply them using machine learning techniques to overcome the limitations of traditional methods in sleep scoring. The objective is to enhance the consistency of polysomnography scoring and develop classifiers with high accuracy for each sleep stage.

Related research
Over the years, researchers have developed different sleep staging methods based on machine learning and deep learning techniques. Most studies can be categorized into i) single-channel-based and ii) multi-channel-based methods. In [16], the authors pooled the epochs of sleep characteristics, then screened the features and selected the most suitable ones based on relevance. In [17], the authors employed a band-pass filter during pre-processing to eliminate artifacts from the data. Their method yielded superior outcomes compared to existing procedures. Specifically, their approach proved effective for detecting dishonesty in EEG-based Brain-Computer Interface (BCI) systems.
In [18], the authors employed an orthogonal convolutional neural network (OCNN) to extract features from recorded polysomnography signals. They conducted their analysis on two publicly available sleep datasets from UCD and MIT-BIH. The OCNN model achieved accuracies of 88.4% and 87.6% with the UCD and MIT-BIH datasets, respectively. In [19], the author employed multi-modal classification and decision-making systems for sleep staging, incorporating an external neural network. The experimental work utilized the CAP sleep dataset, and the results indicated that the model performed well compared to an individual CNN model. The proposed model achieved a high accuracy of 95.43% for the six-class classification problem. In [20], the author introduced a novel approach for automated scoring of different stages of sleep using EEG signals collected from a single channel. This method utilized a unique cascaded recurrent neural network (RNN) architecture. The EEG data underwent preprocessing 55 times, and frequency-domain features were extracted, with the most relevant features selected via feature reduction techniques. Overall, the model achieved a classification accuracy of 86.7% for the five stages of sleep. The primary focus of this effort was to improve classification performance in sleep stage N1, with the aim of achieving satisfactory results in the remaining sleep stages as well. In [21], a novel method for automatic sleep stage categorization using EEG information from a single channel was proposed. The main idea is to directly apply the raw EEG signal to a deep convolutional neural network (CNN), bypassing the traditional feature extraction and selection process used in previous approaches. The suggested network architecture consists of nine convolutional layers followed by two fully connected layers. The proposed method achieved an accuracy above 90% for categorizing two to six classes, representing an improvement over existing methods. Additionally, Cohen's Kappa coefficients were reported as 0.98, 0.94, 0.90, 0.86, and 0.89, respectively, indicating strong agreement between predicted and actual sleep stages.
In [22], the author utilized the concept of a weighted undirected network by mapping the feature vector into it. This network's various structural and spectral characteristics were separated. In [23], the author used multi-scale deep neural architectures, in which the decomposed signals were input into the CNN model for further analysis of the sleep patterns. The model achieved an accuracy of 80.7% using S-EDF and 86.5% with the MASS dataset. In [24], the author used semi-supervised learning techniques for a better representation of EEG signals for sleep staging. The author used two public datasets for this research work. The model achieved accuracies of 70.01% and 50.36% with the S-EDF and ISRUC-Sleep datasets, respectively. In [25], the author introduced a lightweight automated sleep staging system designed specifically for children, utilizing a single-channel EEG signal. The author combined Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models for classifying sleep stages. The experiments were conducted using two datasets: a children's sleep dataset and the Sleep-EDFx dataset. The system achieved an accuracy of 83.06% with the children's sleep dataset using the F4-M1 channel and 86.41% with the Sleep-EDFx dataset with manual feature extraction. In [26], the authors employed multi-branch one-dimensional convolutional neural networks (CNNs), extracted various frequency-domain features from single-channel EEG data, and achieved an accuracy of 90.31%, a specificity of 95.30%, and an F1 score of 65.73%. Some of the recent studies on sleep staging are presented in Table 1.
This research proposed a multi-modal machine learning model aimed at identifying changes in characteristics across individual sleep stages during sleep hours. The model achieves this by fusing multi-modal signals to classify sleep patterns [36]. It has been observed that the EEG signal is the most effective for robust sleep staging analysis. However, accurately analyzing changes in sleep behavior across individual sleep stages remains challenging [37]. The EMG and EOG signals can be acquired and recorded relatively easily, and there is evidence demonstrating a correlation between EEG, EMG, and EOG signals during sleep [38, 39]. The objective is to enhance the consistency in polysomnography scoring and to develop classifiers with high accuracy for each stage of sleep. Recognizing the substantial influence that various sleep stages exert on arousal, our research seeks to address the gap in existing studies by investigating different irregularities [40].
The notable advancements made by this research investigation are summarized as follows:
• Development of an automated sleep staging system by integrating three modalities of polysomnography signals.
The complete research study is presented in six sections. In the first section, the importance of sleep is briefly discussed. The second section presents related studies on sleep staging. The third section briefly presents the proposed methodology. The fourth section illustrates the experimental and simulation result analysis. The fifth section delves into the results obtained and compares them with existing relevant research contributions. Finally, the last section concludes this research work.

Materials and methodology
In our investigation, we have combined AdaBoost with a base classifier, Random Forest (RF), for the classification of sleep stages. RF reduces the model's prediction variance by utilizing bootstrap sampling, with features selected via the ReliefF feature selection algorithm. Meanwhile, AdaBoost addresses the model's prediction bias by optimizing residuals. Consequently, by leveraging the strengths of these two algorithms, our research aims to enhance the performance of sleep staging in the model. This research also investigates the impact of age on sleep behavior, a factor often overlooked in recent studies. It has been noted that there exists a direct correlation between sleep patterns and the age of the subject. This insight is crucial for understanding variations in sleep characteristics across different stages. Recent contributions in sleep stage analysis have been critiqued for overlooking crucial aspects. For instance, they often neglect to consider the age factor when analyzing sleep behavior and fail to address imbalances in sleep epochs across different stages. Additionally, many studies overlook infrequent sleep stage transitions, such as subjects transitioning directly from wakefulness to deep sleep, particularly among healthy individuals [41]. The model is developed using multi-modal PSG signals. The complete framework of this research work is explained in Fig. 2.

Sleep stages classes
According to the sleep rules established by R&K (Rechtschaffen and Kales) and AASM (American Academy of Sleep Medicine), sleep stages can be classified into two to six distinct classes. Details of the sleep stage classification problems considered in this study are shown in Table 2.
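To make the idea of two-class to five-class problems concrete, the following minimal sketch collapses five-stage labels into coarser groupings. The integer codes follow the label convention given in the feature reduction section (0 = Wake, 1 = N1, 2 = N2, 3 = N3, 5 = REM). The groupings themselves (`GROUPINGS`, the names "Light", "Deep", "NREM", "Sleep") are assumptions based on common practice; Table 2 remains the authoritative definition for this study.

```python
# Stage codes as used in this paper: 0=Wake, 1=N1, 2=N2, 3=N3, 5=REM.
WAKE, N1, N2, N3, REM = 0, 1, 2, 3, 5

# Hypothetical groupings for 2- to 5-class problems (assumed, not taken from Table 2).
GROUPINGS = {
    5: {WAKE: "W", N1: "N1", N2: "N2", N3: "N3", REM: "REM"},
    4: {WAKE: "W", N1: "Light", N2: "Light", N3: "Deep", REM: "REM"},
    3: {WAKE: "W", N1: "NREM", N2: "NREM", N3: "NREM", REM: "REM"},
    2: {WAKE: "W", N1: "Sleep", N2: "Sleep", N3: "Sleep", REM: "Sleep"},
}


def relabel(stages, n_classes):
    """Map five-stage epoch labels onto an n_classes-way problem."""
    mapping = GROUPINGS[n_classes]
    return [mapping[s] for s in stages]
```

For example, `relabel([0, 1, 5], 3)` yields one label per epoch drawn from W, NREM, and REM, so the same feature matrix can be reused for each classification problem.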

Data description

ISRUC-Sleep subgroup1 database (ISRUC-SG1)
In this study, ISRUC-Sleep datasets were used, comprising sleep recordings from subjects having distinct medical conditions and affected by various types of sleep issues. These recordings were collected at the Hospital of Coimbra University from 2009 to 2013 [41]. In the present work, 18 subjects were considered, among which 15 are male subjects and 4 female subjects, having an age range between 22 and 76 years.

Sleep-EDF database (S-EDF)
Sleep recordings from a total of 8 Caucasian subjects were collected. The collected recordings are categorized into SC* and ST*. The SC* category contains recordings from four healthy subjects. The ST* category contains recordings from four subjects with mild sleep problems. One EEG signal (Fpz-Cz), one EOG signal, and one EMG signal were recorded for the subjects of each category [42].

Physio Bank CAP Sleep (PB-CAPSD) database
This dataset contains 108 polysomnographic recordings (CAP Sleep Database) [43]. It includes EEG, EOG, and EMG channels, along with other electrophysiological signals. A detailed description of this dataset is given in [43]. This research work retrieved PSG signals from six healthy subjects aged 23 to 37 years. The average sleep time for each subject is 8.5 h. The entire overnight polysomnography recordings were processed under the R&K rules. The number of subject recordings present in each dataset, classified into different sleep stages, is presented in Table 3.

Sleep-EDF-78 dataset
Sleep-EDF-78 is an expanded version of Sleep-EDF-20, comprising 197 overnight polysomnography (PSG) recordings. It includes annotated sleep stage information from 20 healthy subjects and 58 subjects experiencing mild sleep difficulties. The subjects range in age from 25 to 101 years, with 41 male and 37 female participants. These recordings feature various physiological signals, including EEG, EOG, and EMG [44]. Generally, two types of evaluation methods are popular for clinical data: subject-wise (Subject-Independent Test) and epoch-wise (Subject-Dependent Test) (Fig. 3). This article uses the subject-wise and epoch-wise analysis methods on the ISRUC-SG1, S-EDF, and PB-CAPSD databases. Figure 4a-c presents the PSG signals recorded from the ISRUC-Sleep dataset for subject-5, with 30 s of each sleep stage (Wake, N1, N2, N3, and REM), recorded on a subject affected by a small airway obstruction syndrome. In this case, the subject's sleep cycle is continuously disturbed by brief arousals, which causes the deprivation of REM and N3 sleep. Similarly, Fig. 5a-c presents the sleep stage behavior recorded from the Sleep-EDF dataset for subject-sc4002e0, a healthy control subject with no prior sleep problems.

Preprocessing
In this research, artifacts were eliminated by employing a 10th-order Butterworth bandpass filter spanning frequencies from 0.5 to 49.5 Hz [45]. Generally, the raw signals are highly contaminated with different artifacts and irrelevant noise, which makes them difficult to process directly. To eliminate noise and artifacts, notch filtering, a high-pass filter with a cut-off frequency of 0.3 Hz, and a low-pass filter with a cut-off frequency of 30 Hz were applied to the EEG and EOG signals [46, 47]. To process the EMG signal, notch filtering, a high-pass filter with a cut-off frequency of 10 Hz, and a low-pass filter with a cut-off frequency of 75 Hz were utilized [48, 49]. All the preprocessing is performed with the MATLAB signal processing toolbox using digital filtering techniques.
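The study performs its filtering in MATLAB. As a rough Python equivalent, the sketch below designs a Butterworth band-pass filter with SciPy and applies it with zero-phase filtering. The function name `bandpass`, the use of second-order sections, and the zero-phase `sosfiltfilt` step are assumptions made for illustration, as is the reading of "10th-order" as the total order of the band-pass design (SciPy doubles the prototype order for band-pass filters, so the prototype order passed is 5).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(signal, fs, low_hz=0.5, high_hz=49.5, order=10):
    """Zero-phase Butterworth band-pass filter (illustrative sketch).

    `order` is interpreted as the total band-pass order (an assumption);
    butter() doubles the prototype order for band-pass designs.
    """
    sos = butter(order // 2, [low_hz, high_hz], btype="bandpass",
                 fs=fs, output="sos")
    # Forward-backward filtering avoids phase distortion of the waveforms.
    return sosfiltfilt(sos, np.asarray(signal, dtype=float))
```

The sampling rate `fs` must be supplied per dataset and must exceed twice `high_hz`; the EEG/EOG and EMG pass-bands quoted above can be obtained by changing `low_hz` and `high_hz`.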

Features extraction
Feature analysis is of utmost importance for analyzing subjects' behavior and determining the parameters that most strongly decide the classified stage [50]. Feature extraction reveals the characteristics that are strongly correlated with the sleep stage to which a particular epoch belongs, which becomes even more necessary when the signals are highly random and unstable, as is the case with polysomnography signals [51-53]. The features obtained from the different physiological signals are presented in Tables 4, 5, and 6, respectively.
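As an illustration of what a time-domain feature computed on one 30 s epoch can look like, the following sketch computes the three Hjorth parameters and a zero-crossing count in plain Python. These particular features and function names are chosen for illustration only; whether they match entries in Tables 4 to 6 (for instance, whether the HA feature discussed later denotes Hjorth activity) is an assumption.

```python
import math


def variance(x):
    """Population variance of a sequence."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def diff(x):
    """First-order difference, a discrete stand-in for the derivative."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth(x):
    """Return (activity, mobility, complexity) for one epoch."""
    dx = diff(x)
    ddx = diff(dx)
    activity = variance(x)                      # signal power
    mobility = math.sqrt(variance(dx) / activity)   # proxy for mean frequency
    complexity = math.sqrt(variance(ddx) / variance(dx)) / mobility
    return activity, mobility, complexity


def zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))
```

For a pure sinusoid the complexity is close to 1, and it grows as the waveform departs from a sine shape, which is one reason such descriptors are used to separate sleep stages with different rhythmic content.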

Feature normalization
After feature extraction, feature sets with dimensions of 16266 × 63, 15139 × 63, and 6047 × 63 were obtained for the multi-modal PSG signals using ISRUC-SG1, S-EDF, and PB-CAPSD, respectively. Generally, a subject's baseline data and time series information have different orders of magnitude, which makes it difficult for an ML-based classification model to converge during training [51, 54]. To ensure that all features are on the same scale, feature values were standardized using the z-score method, that is, transformed to zero mean and unit variance, after which a normalized feature vector is generated. This generally boosts the system's performance and reduces the influence of outliers.
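The z-score step can be sketched as follows for a feature matrix stored as a list of rows (epochs) by columns (features). The function name `zscore_columns` and the handling of constant columns are illustrative assumptions.

```python
import statistics


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [
        # A constant column carries no information; map it to 0.0
        # instead of dividing by zero.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]
```

In a cross-validated setting, the means and standard deviations would normally be estimated on the training folds only and then applied to the test fold; the sketch standardizes a single matrix for brevity.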

Feature reduction
Feature reduction is also one of the critical steps in the sleep staging process. It has been found that improper signal fusions may sometimes degrade the model's performance.
For this reason, it is essential to screen for the most suitable features, which help discriminate the sleep stages based on their characteristic changes across the individual stages [55]. This study employs ReliefF (ReF), a supervised feature weighting algorithm, to select relevant features. The selected features and their corresponding weights are presented in Tables 7, 8, 9, and 10, respectively. The AdaBoost meta-learning method is fed with the features from the ReF algorithm and uses random forest classifiers as base learners to improve accuracy and mitigate overfitting issues [56]. AdaBoost reinforces a base classifier by boosting its accuracy. The approach is simple and convenient, and it often compares favorably with its counterparts. It also has the added advantage of being non-parametric, and it performs reliably in identifying outlier information in training samples [57, 58]. One of the standout properties of this algorithm is that it is agnostic to the choice of weak learner, so it is used across many classification problems. The algorithm is fed with a training dataset $TD = \{(X_i, Y_i)\}$ for $i = 1, 2, \ldots, N$, where $X_i$ is the feature vector and $Y_i \in \{0, 1, 2, 3, 5\}$ is its label. The class labels 5, 3, 2, 1, and 0 correspond to the REM, N3, N2, N1, and WAKE stages, respectively. The base classification model is then called several times, and the weak hypotheses from the individual rounds are linearly combined to construct the final hypothesis. Random Forest (RF) is one of the most widely accepted classification methods and one of the strongest among the various bagging techniques [59]. Its notable strengths are its ability to process massive datasets and to deal with large numbers of input variables without data loss while characterizing the features relevant for classification. Its ability to manage outliers and noisy data is also noteworthy. The algorithm is an aggregation of classifiers arranged in a tree structure. Each participating tree is built independently from a random sample of the data [60], and this holds for all trees of the forest. The predictions of the trees are combined by voting, and the class with the most votes becomes the final prediction.
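The weighting idea behind ReliefF can be sketched in a few lines of plain Python: for every sample, a feature is penalized when it differs from the nearest neighbours of the same class (hits) and rewarded when it differs from the nearest neighbours of other classes (misses), with misses weighted by class priors. This is a simplified sketch under stated assumptions (all samples are visited rather than a random subset, Manhattan distance on range-scaled features, hypothetical function name `relieff`), not the exact implementation used in the study.

```python
def relieff(X, y, k=1):
    """Return one ReliefF-style relevance weight per feature."""
    n, d = len(X), len(X[0])
    # Scale each per-feature difference by the feature's range.
    lo = [min(r[j] for r in X) for j in range(d)]
    hi = [max(r[j] for r in X) for j in range(d)]
    span = [(h - l) or 1.0 for l, h in zip(lo, hi)]
    classes = sorted(set(y))
    prior = {c: y.count(c) / n for c in classes}

    def diff(a, b):
        return [abs(a[j] - b[j]) / span[j] for j in range(d)]

    def dist(a, b):
        return sum(diff(a, b))

    w = [0.0] * d
    for i in range(n):
        for c in classes:
            # k nearest neighbours of sample i within class c (excluding i).
            idx = [m for m in range(n) if m != i and y[m] == c]
            idx.sort(key=lambda m: dist(X[i], X[m]))
            near = idx[:k]
            if not near:
                continue
            if c == y[i]:
                scale = -1.0                              # nearest hits
            else:
                scale = prior[c] / (1.0 - prior[y[i]])    # nearest misses
            for m in near:
                dm = diff(X[i], X[m])
                for j in range(d):
                    w[j] += scale * dm[j] / (n * len(near))
    return w
```

Sorting the returned weights in descending order and keeping the first twelve indices would mirror the top-12 selection described in this study.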
Random Forest is taken as the base classifier of the AdaBoost algorithm to classify the sleep stages. This combination yielded higher classification accuracy for all the sleep stages. The way the two algorithms work together is presented in Algorithm 3.
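As a separate illustration (not the authors' Algorithm 3), a single boosting round can be sketched as below. The sketch assumes the multi-class SAMME variant of AdaBoost, which the text does not specify, and it treats the base learner as a black box that has already produced predictions `y_pred`; in this study that base learner would be a random forest. The function name `samme_round` is hypothetical.

```python
import math


def samme_round(weights, y_true, y_pred, n_classes):
    """One SAMME boosting round: return (alpha, updated sample weights)."""
    # Weighted error of the current base learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err /= sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)   # keep the logarithm finite
    # Learner weight; the log(K - 1) term is the multi-class correction.
    alpha = math.log((1 - err) / err) + math.log(n_classes - 1)
    # Increase the weight of misclassified epochs, then renormalize.
    new_w = [w * math.exp(alpha) if t != p else w
             for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]
```

The final prediction of the ensemble is the class that receives the largest sum of `alpha` values over all rounds, which is the linear combination of weak hypotheses mentioned above.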

Testing schemes
Epoch-wise Test (Subject Dependent Test)
In this testing scheme, tenfold cross-validation is applied with the samples of all subjects mixed together to evaluate the proposed model's performance. During this test procedure, both the training and testing samples may be obtained from the same subject. Therefore, the performance under this testing scheme may be overly optimistic and is not directly comparable to the subject-wise analysis.

Subject-wise Test (Subject Independent Test)
In this testing method, we adopted a cross-validation strategy in which, via 10-fold cross-validation, one set of data is considered as testing data while the others are treated as training data. This testing procedure was repeated K times for K subjects. Each subject's data is used as the test set in turn, whereas the other K-1 subjects' data are used for training the proposed classification model.
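The difference between the two testing schemes comes down to how epoch indices are split. The sketch below contrasts a shuffled k-fold split over all epochs with a leave-one-subject-out split; the function names, the fixed seed, and the leave-one-subject-out reading of the subject-wise scheme are assumptions made for illustration.

```python
import random


def epoch_wise_folds(n_epochs, k=10, seed=0):
    """Shuffled k-fold split: epochs of one subject can land in train and test."""
    idx = list(range(n_epochs))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [e for j, f in enumerate(folds) if j != i for e in f]
        yield train, test


def subject_wise_folds(subject_ids):
    """Leave-one-subject-out split: the test subject never appears in training."""
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield train, test
```

Here `subject_ids` holds one subject identifier per epoch. Because neighbouring epochs of the same night are highly correlated, the epoch-wise split tends to give higher scores than the subject-wise split on the same data.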

Performance evaluation metrics
In this section, the performance of the model is measured using six standard metrics: accuracy (ACC) [61], sensitivity (SEN), specificity (SPC) [61], precision (PRE) [62], F1 score (F1Sc) [63], and Cohen's Kappa score [64]. We used three public datasets, ISRUC-SG1, S-EDF, and PB-CAPSDB, under AASM rules to better assess the model's efficiency. The study executes six individual experiments using multi-modal PSG signals based on two different testing schemes: epoch-wise and subject-wise. Table 10 presents the brief settings for all the experiments, which cover the two-class to five-class sleep stage classification problems. A total of 63 features were extracted, which include features 1-30 from EEG, 31-46 from EOG, and 47-62 from EMG, respectively.
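All of these metrics can be derived from a confusion matrix. The following sketch assumes `cm[i][j]` counts epochs whose true stage is `i` and predicted stage is `j`, and computes the per-class metrics in a one-vs-rest fashion; the function names are illustrative.

```python
def _safe_div(a, b):
    return a / b if b else 0.0


def overall_accuracy(cm):
    """Fraction of epochs on the diagonal of the confusion matrix."""
    total = sum(map(sum, cm))
    return _safe_div(sum(cm[i][i] for i in range(len(cm))), total)


def cohen_kappa(cm):
    """Agreement between true and predicted labels, corrected for chance."""
    n = len(cm)
    total = sum(map(sum, cm))
    po = overall_accuracy(cm)
    # Chance agreement from the row (true) and column (predicted) marginals.
    pe = sum(sum(cm[i]) * sum(cm[r][i] for r in range(n))
             for i in range(n)) / total ** 2
    return _safe_div(po - pe, 1 - pe)


def class_metrics(cm, c):
    """One-vs-rest sensitivity, specificity, precision, and F1 for class c."""
    n = len(cm)
    total = sum(map(sum, cm))
    tp = cm[c][c]
    fn = sum(cm[c]) - tp
    fp = sum(cm[r][c] for r in range(n)) - tp
    tn = total - tp - fn - fp
    return {
        "SEN": _safe_div(tp, tp + fn),
        "SPC": _safe_div(tn, tn + fp),
        "PRE": _safe_div(tp, tp + fp),
        "F1": _safe_div(2 * tp, 2 * tp + fp + fn),
    }
```

For a five-class problem, `class_metrics` would be called once per stage, while `overall_accuracy` and `cohen_kappa` summarize the whole matrix.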

Performance evaluation of proposed sleep staging model using an individual feature
To identify the impact of the screened features of the EEG, EOG, and EMG signals on sleep staging, we investigate the individual features under the AASM sleep standards. The sleep staging performance is analyzed on the basis of a single feature using the same datasets, testing schemes, and proposed classification model. We extracted 30 features from EEG signals (see Table 3) and 16 features each from EOG (see Table 4) and EMG (see Table 5) signals. Finally, suitable features are selected based on their weight values, where a higher weight signifies greater suitability. The features sorted using the ReF feature selection algorithm for EEG, EOG, and EMG signals are presented in Tables 7, 8, and 9, respectively. The top 12 features extracted from EEG, EOG, and EMG signals are presented in Table 10 for sleep staging. The selected features were fed one by one into the classifier. From Table 11, it has been observed that sleep staging using single EEG features does not perform well. The highest accuracy, 72.79%, was achieved using the SE feature, and the lowest, 38.10%, was reported using RSP_alpha under the epoch-wise testing scheme. Similarly, the accuracies reported under subject-wise testing were 69.66% for SE and 34.76% for RSP_alpha.
It has been found from Table 12 that, using the top 12 selected features of the EOG signal, the same classification model reported the highest accuracy with the SE feature (74.79%) and the lowest accuracy with the ME feature (28.10%) under epoch-wise testing, and similarly SE (69.58%) and M (25.12%) under subject-wise testing.
From Table 13, it is noted that the sleep staging performance reached its peak with the SPE feature, achieving an accuracy of 44.19% and a sensitivity of 40.76%, and was lowest with the HA feature (26.77% and 24.02%) under the epoch-wise and subject-wise testing schemes. Finally, it has been seen from Tables 10, 11, and 12 that the sleep staging performance is relatively low, since a single feature can only partially discriminate the sleep stages. Although single-channel and single-feature approaches face this challenge during sleep staging, combinations of signals and features may perform better.

Performance evaluation of the proposed sleep staging model using multi-modal signal fusions
In this section, the effectiveness of multi-modal signal fusions is analyzed during sleep staging. Six individual experiments are performed using three widely accepted public datasets: ISRUC-SG1, S-EDF, and PB-CAPSD. The brief experiment settings are presented in Table 14, and all the experiments are based on the five-class sleep stage classification. Experiment-1 to Experiment-3 use epoch-wise testing, and Experiment-4 to Experiment-6 use subject-wise testing.

Analysis of the sleep staging performance using Subject-wise (Experiment-5 to Experiment-8)
Here, we again considered the same three public datasets and the same multi-modal signal features for all four experiments; the only change is the testing scheme, which is subject-wise analysis. The other parameters remained the same as in the earlier experiments of this study. The reported confusion matrix for Experiment-5 to Experiment-8 using

Analysis of sleep staging classification performance using single-channel and multi-modal signals fusions
This analysis was done through the same three datasets in both the testing procedures (epoch-wise and subject-wise).
The overall accuracy performance for 2C to 5C classification problems using individual and multi-modal signal fusions using the epoch-wise testing method is shown in Figs. 6, 7, and 8. Similarly, the reported graph performance results using subject-wise testing procedures are shown in Figs. 9, 10, and 11 with ISRUC-SG1, S-EDF, and PB-CAPSDB, respectively. It has been noticed from the above presented graphical results that the overall accuracy performances are improved with combinations of the

Discussion
Several studies on sleep staging methods have generally focused on the classification methods. Some of the sleep studies were based on traditional time-frequency analysis using machine learning techniques [20-33], and others on deep learning techniques [34, 37, 39, 40, 51, 55-64]. Generally, during sleep studies, several kinds of signals are recorded for analyzing the changes in sleep characteristics during sleep. It is generally preferable and advantageous to consider multi-modal signal fusions during sleep quality assessment compared to the individual signals [65-79]. From the experimental results using both testing schemes (epoch-wise and subject-wise), it is concluded that the multi-modal signal fusions can discriminate the sleep stages at an acceptable level using AdaBoost with RF as the base classifier. Therefore, this proposed methodology is much more effective than other machine learning models. The effectiveness of the multi-modal signals using epoch-wise and subject-wise testing during sleep staging is illustrated in Tables 23 and 32, respectively. The sleep staging accuracies for the two-class to five-class problems using individual and multi-modal signal fusions are illustrated in Figs. 6, 7, 8 and Figs. 9, 10, 11 based on epoch-wise and subject-wise testing, respectively. More specifically, the proposed multi-modal signal fusions (EEG + EOG + EMG) contain valuable information regarding changes in sleep characteristics during sleep periods. EEG signals capture information about the brain and its activities during sleep, and they also help to study the changes in rhythm (alpha, delta, theta, and beta) during the different sleep stages [80-83]. Similarly, the EOG records eye movement information, which helps recognize the W and REM stages [84-86]. EMG signals capture information about muscular activity, and it has been found that muscular activity is higher during the W stage than during the REM stage. This information helps to discriminate the W and REM stages properly. Therefore, fusing the three signal modalities (EEG + EOG + EMG) helps discriminate the NREM sleep stages (N1, N2, and N3), and the extracted multi-modal signal features support discriminating the sleep status in various aspects, which directly contributes to the improvement in sleep staging accuracy [59-65]. The proposed study investigated 63 features (time-domain, frequency-domain, and non-linear features) from the three modalities of the polysomnography signals (EEG, EOG, and EMG). The selected joint optimal features were applied to the proposed classification model (AdaBoost with RF). To measure the effectiveness of the proposed methodology, both testing procedures (epoch-wise and subject-wise) were adopted in our experiments. The present study was performed on three widely used datasets, ISRUC-SG1, S-EDF, and PB-CAPSDB, to analyze the effectiveness of the proposed methodology.
Fig. 7 The overall accuracies performances of two-five sleep stages classification with the S-EDF dataset
Fig. 8 Overall accuracies performances of two-five sleep stages classification with ISRUC-SG1 dataset
Fig. 9 Overall accuracies performances of two-five sleep stages classification are compared between using single-channel and multi-modal of signals fusions using subject-wise testing procedures with ISRUC-SG1 dataset
Fig. 10 Overall accuracies performances of two-five sleep stages classification are compared between using single-channel and multi-modal of signals fusions using subject-wise testing procedures with S-EDF dataset
The required recordings were retrieved from subjects who had difficulty sleeping and from healthy control subjects. To show the effectiveness of the sleep staging performance, the classification results provided by the proposed model (automatically) and by manual staging are shown in Figs. 12, 13, and 14, where the hypnograms of the ISRUC-Sleep-SG1, S-EDF, and PB-CAPSDB datasets are utilized.

Complexity comparison with other approaches
Many researchers have addressed two- to five-class sleep stage classification problems so far. To measure the effectiveness of our proposed methodology, brief comparisons were made with existing state-of-the-art sleep staging methods. Figures 12, 13, and 14 illustrate the comparison between labels manually assigned by sleep experts and those predicted by the proposed method using one-night data records. Figures 13 and 14 depict sleep data revealing that the subject experienced approximately 6 h of effective sleep time. The individual entered a deep sleep phase shortly after initially falling asleep. Despite numerous awakenings during the sleep period, the subject promptly transitioned back into a sleep state after each awakening.

We also compared the results with recently published works such as LGSleepNet [5], SleepEEGNet [79], TinySleepNet [87], XSleepNet [88], CoSleepNet [89], SSleepNet [90], and RobustSleepNet [91], based on multi-modal signal fusions and the same datasets. As observed in Tables 33, 34, and 35, our proposed sleep staging classification method has demonstrated superior performance compared to the other methods across the three datasets. The comparison shows that the proposed multi-modal signal fusion achieves high sleep staging classification accuracy. The overall accuracy and Cohen's kappa score are 94.30% and 0.92 (ISRUC-SG1), 94.18% and 0.90 (S-EDF), and 92.34% and 0.90 (PB-CAPSDB) for the five-class (5C) classification problem using epoch-wise analysis. Similarly, the proposed model achieved 91.37% and 0.89 (ISRUC-SG1), 91.08% and 0.87 (S-EDF), and 91.55% and 0.89 (PB-CAPSDB) using subject-wise analysis. Tables 23 and 32 demonstrate that the kappa score surpasses 0.80 under both testing procedures, indicating excellent agreement between manual and automatic scoring.

Generally, in sleep staging, discriminating the N1 stage is complicated and challenging because it is the transition stage between the Wake stage and the N2 stage. However, the SEN-N1 performance is improved using subject-wise analysis with ISRUC-SG1 (93.18%), S-EDF (66.95%), and PB-CAPSDB (86.62%) compared with the results reported using epoch-wise testing.
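For readers who want to reproduce the agreement figures from a confusion matrix, the following is a minimal sketch of how overall accuracy, Cohen's kappa, and a per-stage sensitivity are conventionally computed. Kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the row and column totals. The function names and the 2 x 2 matrix are hypothetical and are not taken from the paper's tables; reading SEN as per-stage sensitivity is likewise an assumption.

```python
def accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    cm[i][j] is the number of epochs whose manual label is class i and
    whose predicted label is class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    row_tot = [sum(cm[i]) for i in range(n)]
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement implied by the marginal totals.
    expected = sum(row_tot[k] * col_tot[k] for k in range(n)) / (total ** 2)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


def sensitivity(cm, k):
    """Fraction of epochs manually scored as class k that were predicted as k."""
    return cm[k][k] / sum(cm[k])


# Hypothetical two-class matrix (rows: manual scoring, columns: prediction).
toy_cm = [[40, 10],
          [5, 45]]

acc, kappa = accuracy_and_kappa(toy_cm)
sen_first = sensitivity(toy_cm, 0)
```

The same functions apply unchanged to a 5 x 5 matrix for the five-class problem, with the N1 row index passed to `sensitivity`.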

Computation time analysis
Another crucial factor for assessing a classifier is the computation time, although the training time is not taken into account in this analysis. First, the computation time for each stage of the proposed ADB + RF method is logged, and the average value is then computed.

Conclusion
Accurate and effective sleep staging is a highly important step for analyzing and identifying sleep irregularities. To develop a highly accurate and robust automatic sleep staging system, this paper presents a computer-aided

R&K rules categorize the entire sleep cycle into seven stages, including Wake (W), Stage 1 (S1), Stage 2 (S2), Stage 3 (S3), Stage 4 (S4), Rapid Eye Movement (REM), and movement time. Stages S1 to S4 are considered non-REM sleep stages. In later research, the American Academy of Sleep Medicine (AASM) introduced updated guidelines, consolidating the sleep cycle into five stages: Wakefulness (W), N1, N2, N3, and REM, with

Page 2 of 29 Satapathy et al. BMC Medical Informatics and Decision Making (2024) 24:119

Fig. 2 The complete layout of the proposed research work

Fig. 3 Training and Testing data partitioning methods: a subject-wise, b epoch-wise

Fig. 11 Overall accuracies of two- to five-class sleep stage classification compared between single-channel and multi-modal signal fusions using subject-wise testing procedures with the PB-CAPSDB dataset

Fig. 14 Comparison of hypnogram annotation of manual sleep staging (green) and proposed sleep staging method (purple) with PB-CAPSDB dataset

sleep staging system capable of classifying two to five sleep states using multi-modal signal fusion of polysomnography (PSG) signals following the AASM sleep scoring guidelines. The proposed approach involves extracting multiple features, including time-based, frequency-based, statistical-based, entropy-based, and non-linear features, from three modalities (EEG, EOG, and EMG) of PSG signals. The model is evaluated on three widely accepted datasets: ISRUC-SG1, S-EDF, and PB-CAPSDB. These datasets include sleep recordings from subjects affected by various types of sleep-related disorders as well as healthy control subjects, totaling 16,266 epochs (ISRUC-SG1), 15,139 epochs (S-EDF), and 6,047 epochs (PB-CAPSDB) of 30-s length each. In

Table 1
Recent research works carried out on automated sleep stage classification using EEG and PSG signals

Table 3
Description of distribution of sleep stages

Table 4
Extracted features from EEG Signal

Table 5
Extracted features from EOG signal

Table 6
Extracted features from EMG signal

Table 7
EEG features with their ReliefF weights

Table 8
EOG features with their ReliefF weights

Table 9
EMG features with their ReliefF weights

Only polysomnography (PSG) signals have been considered for this work, combining the physiological signals from three channels: EEG, EOG, and EMG. The entire duration of these recordings is segmented into epochs of 30 s. This study adopted two testing schemes: epoch-wise and subject-wise. All experiments were implemented and executed in MATLAB 2017b.
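The difference between the two testing schemes can be sketched as follows. This is an illustration in Python rather than the authors' MATLAB code, and it rests on an assumed reading of the schemes: epoch-wise testing pools the 30-s epochs of all subjects before splitting, while subject-wise testing holds out whole subjects so that no test subject contributes epochs to training. The function names, the `"subject"` and `"stage"` keys, and the 30% test fraction are hypothetical choices.

```python
import random


def epoch_wise_split(epochs, test_fraction=0.3, seed=0):
    """Shuffle all epochs together; one subject may appear in both sets."""
    idx = list(range(len(epochs)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(test_fraction * len(idx)))
    test_idx = set(idx[:n_test])
    train = [e for i, e in enumerate(epochs) if i not in test_idx]
    test = [e for i, e in enumerate(epochs) if i in test_idx]
    return train, test


def subject_wise_split(epochs, test_fraction=0.3, seed=0):
    """Hold out whole subjects; train and test share no subject."""
    subjects = sorted({e["subject"] for e in epochs})
    random.Random(seed).shuffle(subjects)
    n_test = max(1, int(round(test_fraction * len(subjects))))
    test_subjects = set(subjects[:n_test])
    train = [e for e in epochs if e["subject"] not in test_subjects]
    test = [e for e in epochs if e["subject"] in test_subjects]
    return train, test


# Hypothetical records: 5 subjects with 4 scored epochs each.
toy_epochs = [{"subject": s, "stage": "N2"} for s in range(5) for _ in range(4)]

ew_train, ew_test = epoch_wise_split(toy_epochs)
sw_train, sw_test = subject_wise_split(toy_epochs)
```

Because adjacent epochs of one recording are strongly correlated, the subject-wise split is usually the stricter test of generalization to unseen subjects.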

Table 10
Features selected from EEG, EOG, and EMG signals fusions

Table 11
Overall accuracies of sleep staging using single-selected features with the Top 12 selected features of EEG signal

Table 12
Overall accuracies of sleep staging using single-selected features with the Top 12 selected features of EOG signal

Table 13
Overall accuracies of sleep staging using single-selected features with the Top 12 selected features of EMG signal

Table 14
Experiments under different testing procedures

Table 16
Confusion matrix for 5C sleep staging using EEG + EOG + EMG on S-EDF dataset

Table 17
Confusion matrix of EEG + EOG + EMG for 5C sleep staging with PB-CAPSDB dataset

SG1 and PB-CAPSDB datasets, respectively. The classification performance results for the five-class (5C) to two-class (2C) problems using the ISRUC-SG1, S-EDF, PB-CAPSDB, and S-EDF-78 datasets based on the epoch-wise testing scheme are presented in Table

Table 18
Confusion matrix of EEG + EOG + EMG for 5C sleep staging with S-EDF-78 dataset

Table 19
Performance metrics values obtained for 5C sleep staging using an ISRUC-SG1 dataset with EEG + EOG + EMG

Table 20
Performance metrics values obtained for 5C sleep staging using an S-EDF dataset with EEG + EOG + EMG

Table 21
Performance metrics values obtained for 5C sleep staging using a PB-CAPSDB dataset with EEG + EOG + EMG

ISRUC-SG1, S-EDF, PB-CAPSDB, S-EDF-78 based on subject-wise analysis are shown in Tables 24, 25, 26 and 27, respectively. Similarly, the proposed model's performance results based on the subject-wise testing procedure for all the aforementioned datasets are presented in Tables 28, 29, 30, and 31, respectively. Finally, Table 32 presents the results for the two-class (2C) to five-class (5C) sleep stage classification problems.

Table 22
Performance metrics values obtained for 5C sleep staging using an S-EDF-78 dataset with EEG + EOG + EMG

Table 23
Performance of accuracy and Cohen's kappa score with top 12 selected features for ISRUC-SG1, S-EDF, and PB-CAPSDB scored according to AASM guidelines using epoch-wise testing procedures

Table 25
Confusion matrix for Experiment 5 on S-EDF using EEG + EOG + EMG for 5C sleep staging

Table 26
Confusion matrix for Experiment 6 on PB-CAPSDB using EEG + EOG + EMG for 5C sleep staging

Table 28
Performance evaluation results for five-class sleep staging using ISRUC-SG1 with multi-modal signal fusions

Table 29
Performance evaluation results for five-class sleep staging using S-EDF with multi-modal signal fusions

Table 30
Performance evaluation results with multi-modal signal fusions for five-class sleep staging using PB-CAPSDB

Table 31
Performance evaluation results with multi-modal signal fusions for five-class sleep staging using PB-CAPSDB

Table 32
Average overall accuracy and Cohen's kappa score with top 12 selected features for ISRUC-SG1, S-EDF, and PB-CAPSDB scored according to AASM guidelines using subject-wise testing procedures

Table 33
Performance comparison between the proposed study and state-of-the-art methods based on the signals used

Table 34
Performance comparisons based on the dataset obtained for the experiment

Table 35
Performance of ADB + RF model compared with the existing literature based on subject-dependent and subject-independent testing

Subject-Independent Test (Subject-wise Test)
The time consumption in seconds at various stages of the proposed scheme over Sleep-EDF is as follows: for each recorded signal, the computation time is 0.018 s for feature extraction, 0.005 s for feature reduction, and 0.0013 s for classification. The overall computation time for 3000 epochs is approximately 7 min, which is considered sufficiently fast to meet real-time requirements. Feature extraction is the most time-consuming stage, suggesting potential for optimization. Nevertheless, by utilizing only the top 12 features per epoch, storage costs are reduced, calculations are simplified, and the scheme surpasses other comparable methods in terms of feature usage and accuracy.
this study, the chosen optimal features are fed into a highly robust, adaptable, and scalable classifier, AdaBoost with Random Forest as the base classifier. This approach directly contributes to enhancing classification accuracy. All experiments in this study were conducted using two testing procedures, epoch-wise and subject-wise. The experimental results show that the proposed methodology using multi-modal signal fusion is superior to other machine learning classification models, with epoch-wise overall accuracies of 98.39%, 97.21%, 95.67%, and 94.30% using ISRUC-SG1; 98.10%, 97.02%, 95.09%, and 94.18% using S-EDF; 97.79%, 96.69%, 94.89%, and 92.34% using PB-CAPSDB; and 98.12%, 97.01%, 94.49%, and 95.38% using S-EDF-78 for the two- to five-class problems, respectively. Under subject-wise testing, the proposed model reported accuracies of 98.23%, 96.89%, 94.42%, and 91.37% using ISRUC-SG1; 97.95%, 95.91%, 93.07%, and 91.08% using S-EDF; 97.05%, 95.91%, 93.07%, and 91.55% using PB-CAPSDB; and 98.10%, 97.13%, 95.05%, and 94.79% using S-EDF-78 for the two- to five-class problems, respectively. In future work, we will extend the proposed approach by integrating different physiological signals, which can help to detect more than one type of sleep disorder simultaneously.